52 Human Secreted Proteins
Field of the Invention

This invention relates to newly identified polynucleotides, polypeptides 
encoded by these polynucleotides, antibodies that bind these polypeptides, uses of 
such polynucleotides, polypeptides, and antibodies, and their production.

Background of the Invention 
Unlike bacteria, which exist as a single compartment surrounded by a
membrane, human cells and other eukaryotes are subdivided by membranes into many
functionally distinct compartments. Each membrane-bounded compartment, or 
organelle, contains different proteins essential for the function of the organelle. The 
cell uses "sorting signals," which are amino acid motifs located within the protein, to 
target proteins to particular cellular organelles. 

One type of sorting signal, called a signal sequence, a signal peptide, or a 
leader sequence, directs a class of proteins to an organelle called the endoplasmic 
reticulum (ER). The ER separates the membrane-bounded proteins from all other
types of proteins. Once localized to the ER, both groups of proteins can be further 
directed to another organelle called the Golgi apparatus. Here, the Golgi distributes 
the proteins to vesicles, including secretory vesicles, the cell membrane, lysosomes, 
and the other organelles. 

Proteins targeted to the ER by a signal sequence can be released into the
extracellular space as a secreted protein. For example, vesicles containing secreted
proteins can fuse with the cell membrane and release their contents into the
extracellular space - a process called exocytosis. Exocytosis can occur constitutively 
or after receipt of a triggering signal. In the latter case, the proteins are stored in 
secretory vesicles (or secretory granules) until exocytosis is triggered. Similarly, 
proteins residing on the cell membrane can also be secreted into the extracellular 
space by proteolytic cleavage of a "linker" holding the protein to the membrane. 

Despite the great progress made in recent years, only a small number of genes 
encoding human secreted proteins have been identified. These secreted proteins 
include the commercially valuable human insulin, interferon, Factor VIII, human 
growth hormone, tissue plasminogen activator, and erythropoietin. Thus, in light of
the pervasive role of secreted proteins in human physiology, a need exists for
identifying and characterizing novel human secreted proteins and the genes that
encode them. This knowledge will allow one to detect, to treat, and to prevent
medical diseases, disorders, and/or conditions by using secreted proteins or the genes
that encode them.



Summary of the Invention 

The present invention relates to novel polynucleotides and the encoded 
polypeptides. Moreover, the present invention relates to vectors, host cells,
antibodies, and recombinant and synthetic methods for producing the polypeptides
and polynucleotides. Also provided are diagnostic methods for detecting diseases,
disorders, and/or conditions related to the polypeptides and polynucleotides, and
therapeutic methods for treating such diseases, disorders, and/or conditions. The
invention further relates to screening methods for identifying binding partners of the
polypeptides.

Detailed Description 

Definitions 

The following definitions are provided to facilitate understanding of certain 
terms used throughout this specification.

In the present invention, "isolated" refers to material removed from its original 
environment (e.g., the natural environment if it is naturally occurring), and thus is 
altered "by the hand of man" from its natural state. For example, an isolated 
polynucleotide could be part of a vector or a composition of matter, or could be 
contained within a cell, and still be "isolated" because that vector, composition of
matter, or particular cell is not the original environment of the polynucleotide. The 
term "isolated" does not refer to genomic or cDNA libraries, whole cell total or 
mRNA preparations, genomic DNA preparations (including those separated by 
electrophoresis and transferred onto blots), sheared whole cell genomic DNA 
preparations or other compositions where the art demonstrates no distinguishing
features of the polynucleotide/sequences of the present invention.

In the present invention, a "secreted" protein refers to those proteins capable
of being directed to the ER, secretory vesicles, or the extracellular space as a result of
a signal sequence, as well as those proteins released into the extracellular space
without necessarily containing a signal sequence. If the secreted protein is released
into the extracellular space, the secreted protein can undergo extracellular processing 
to produce a "mature" protein. Release into the extracellular space can occur by many 
mechanisms, including exocytosis and proteolytic cleavage. 

In specific embodiments, the polynucleotides of the invention are at least 15,
at least 30, at least 50, at least 100, at least 125, at least 500, or at least 1000
continuous nucleotides but are less than or equal to 300 kb, 200 kb, 100 kb, 50 kb, 15
kb, 10 kb, 7.5 kb, 5 kb, 2.5 kb, 2.0 kb, or 1 kb, in length. In a further embodiment,
polynucleotides of the invention comprise a portion of the coding sequences, as
disclosed herein, but do not comprise all or a portion of any intron. In another
embodiment, the polynucleotides comprising coding sequences do not contain coding
sequences of a genomic flanking gene (i.e., 5' or 3' to the gene of interest in the
genome). In other embodiments, the polynucleotides of the invention do not contain
the coding sequence of more than 1000, 500, 250, 100, 50, 25, 20, 15, 10, 5, 4, 3, 2, or
1 genomic flanking gene(s).

As used herein, a "polynucleotide" refers to a molecule having a nucleic acid
sequence contained in SEQ ID NO:X or the cDNA contained within the clone
deposited with the ATCC. For example, the polynucleotide can contain the
nucleotide sequence of the full length cDNA sequence, including the 5' and 3'
untranslated sequences, the coding region, with or without the signal sequence, the
secreted protein coding region, as well as fragments, epitopes, domains, and variants
of the nucleic acid sequence. Moreover, as used herein, a "polypeptide" refers to a 
molecule having the translated amino acid sequence generated from the 
polynucleotide as broadly defined. 

In the present invention, the full length sequence identified as SEQ ID NO:X 
was often generated by overlapping sequences contained in multiple clones (contig
analysis). A representative clone containing all or most of the sequence for SEQ ID
NO:X was deposited with the American Type Culture Collection ("ATCC"). As
shown in Table 1, each clone is identified by a cDNA Clone ID (Identifier) and the
ATCC Deposit Number. The ATCC is located at 10801 University Boulevard,
Manassas, Virginia 20110-2209, USA. The ATCC deposit was made pursuant to the
terms of the Budapest Treaty on the international recognition of the deposit of
microorganisms for purposes of patent procedure.

A "polynucleotide" of the present invention also includes those 
polynucleotides capable of hybridizing, under stringent hybridization conditions, to 
sequences contained in SEQ ID NO:X, the complement thereof, or the cDNA within
the clone deposited with the ATCC. "Stringent hybridization conditions" refers to an
overnight incubation at 42 degrees C in a solution comprising 50% formamide, 5x SSC
(750 mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x
Denhardt's solution, 10% dextran sulfate, and 20 ng/ml denatured, sheared salmon
sperm DNA, followed by washing the filters in 0.1x SSC at about 65 degrees C.

Also contemplated are nucleic acid molecules that hybridize to the
polynucleotides of the present invention at lower stringency hybridization conditions.
Changes in the stringency of hybridization and signal detection are primarily
accomplished through the manipulation of formamide concentration (lower
percentages of formamide result in lowered stringency), salt conditions, or
temperature. For example, lower stringency conditions include an overnight
incubation at 37 degrees C in a solution comprising 6X SSPE (20X SSPE = 3M NaCl;
0.2M NaH2PO4; 0.02M EDTA, pH 7.4), 0.5% SDS, 30% formamide, 100 ug/ml
salmon sperm blocking DNA; followed by washes at 50 degrees C with 1X SSPE,
0.1% SDS. In addition, to achieve even lower stringency, washes performed
following stringent hybridization can be done at higher salt concentrations (e.g., 5X
SSC).
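
The buffer strengths named in these conditions follow from linear dilution of the stock recipes stated in the text (5x SSC = 750 mM NaCl, 75 mM trisodium citrate; 20X SSPE = 3 M NaCl, 0.2 M NaH2PO4, 0.02 M EDTA). The following Python sketch works through that arithmetic. The function name `dilute` and the dictionary layout are illustrative assumptions and are not part of this specification.

```python
# Stock recipes as stated in the text, expressed in mM.
SSC_5X = {"NaCl": 750.0, "trisodium citrate": 75.0}
SSPE_20X = {"NaCl": 3000.0, "NaH2PO4": 200.0, "EDTA": 20.0}


def dilute(stock, stock_strength, target_strength):
    """Scale each component linearly from stock_strength X to target_strength X."""
    factor = target_strength / stock_strength
    return {name: round(conc * factor, 6) for name, conc in stock.items()}


# 0.1x SSC: the stringent wash.
stringent_wash = dilute(SSC_5X, 5, 0.1)
# 6X SSPE: the lower-stringency hybridization buffer.
low_stringency_hyb = dilute(SSPE_20X, 20, 6)
# 1X SSPE: the lower-stringency wash.
low_stringency_wash = dilute(SSPE_20X, 20, 1)

print(stringent_wash)       # {'NaCl': 15.0, 'trisodium citrate': 1.5}
print(low_stringency_hyb)   # {'NaCl': 900.0, 'NaH2PO4': 60.0, 'EDTA': 6.0}
print(low_stringency_wash)  # {'NaCl': 150.0, 'NaH2PO4': 10.0, 'EDTA': 1.0}
```

The stringent wash therefore contains one fiftieth of the salt present in the 5x SSC hybridization solution, which, together with the higher wash temperature, is what makes it the more stringent step.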

Note that variations in the above conditions may be accomplished through the 
inclusion and/or substitution of alternate blocking reagents used to suppress 
background in hybridization experiments. Typical blocking reagents include 
Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and
commercially available proprietary formulations. The inclusion of specific blocking
reagents may require modification of the hybridization conditions described above, 
due to problems with compatibility. 

Of course, a polynucleotide which hybridizes only to polyA+ sequences (such
as any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a
complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid
molecule containing a poly(A) stretch or the complement thereof (e.g., practically
any double-stranded cDNA clone generated using oligo dT as a primer). 

The polynucleotide of the present invention can be composed of any 
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or
DNA or modified RNA or DNA. For example, polynucleotides can be composed of
single- and double-stranded DNA, DNA that is a mixture of single- and
double-stranded regions, single- and double-stranded RNA, and RNA that is a mixture of
single- and double-stranded regions, hybrid molecules comprising DNA and RNA
that may be single-stranded or, more typically, double-stranded or a mixture of
single- and double-stranded regions. In addition, the polynucleotide can be composed of
triple-stranded regions comprising RNA or DNA or both RNA and DNA. A
polynucleotide may also contain one or more modified bases or DNA or RNA
backbones modified for stability or for other reasons. "Modified" bases include, for
example, tritylated bases and unusual bases such as inosine. A variety of
modifications can be made to DNA and RNA; thus, "polynucleotide" embraces
chemically, enzymatically, or metabolically modified forms.

The polypeptide of the present invention can be composed of amino acids 
joined to each other by peptide bonds or modified peptide bonds, i.e., peptide 
isosteres, and may contain amino acids other than the 20 gene-encoded amino acids. 
The polypeptides may be modified by either natural processes, such as 
posttranslational processing, or by chemical modification techniques which are well
known in the art. Such modifications are well described in basic texts and in more 
detailed monographs, as well as in a voluminous research literature. Modifications 
can occur anywhere in a polypeptide, including the peptide backbone, the amino acid 
side-chains and the amino or carboxyl termini. It will be appreciated that the same 
type of modification may be present in the same or varying degrees at several sites in
a given polypeptide. Also, a given polypeptide may contain many types of
modifications. Polypeptides may be branched, for example, as a result of
ubiquitination, and they may be cyclic, with or without branching. Cyclic, branched,
and branched cyclic polypeptides may result from posttranslational natural processes or
may be made by synthetic methods. Modifications include acetylation, acylation,
ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a 
heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent
attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol,
cross-linking, cyclization, disulfide bond formation, demethylation, formation of 
covalent cross-links, formation of cysteine, formation of pyroglutamate, formylation, 
gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, 
iodination, methylation, myristoylation, oxidation, pegylation, proteolytic processing,
phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA 
mediated addition of amino acids to proteins such as arginylation, and ubiquitination. 
(See, for instance, PROTEINS - STRUCTURE AND MOLECULAR PROPERTIES, 
2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993); 
POSTTRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B. C.
Johnson, Ed., Academic Press, New York, pgs. 1-12 (1983); Seifter et al., Meth 
Enzymol 182:626-646 (1990); Rattan et al., Ann NY Acad Sci 663:48-62 (1992).) 

"SEQ ID NO:X" refers to a polynucleotide sequence while "SEQ ID NO: Y" 
refers to a polypeptide sequence, both sequences identified by an integer specified in 
Table 1.

"A polypeptide having biological activity" refers to polypeptides exhibiting 
activity similar, but not necessarily identical to, an activity of a polypeptide of the 
present invention, including mature forms, as measured in a particular biological 
assay, with or without dose dependency. In the case where dose dependency does 
exist, it need not be identical to that of the polypeptide, but rather substantially similar
to the dose-dependence in a given activity as compared to the polypeptide of the
present invention (i.e., the candidate polypeptide will exhibit greater activity or not
more than about 25-fold less and, preferably, not more than about tenfold less
activity, and most preferably, not more than about three-fold less activity relative to
the polypeptide of the present invention).
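
The fold-activity thresholds in this definition can be expressed as a simple comparison. The following Python sketch is illustrative only: the function name `activity_tier` and the tier labels are assumptions, and because the definition says "about," the hard cutoffs used here are an approximation of the stated ranges.

```python
def activity_tier(candidate_activity, reference_activity):
    """Classify a candidate by how many fold less active it is than the reference.

    Both activities are assumed to be positive numbers measured in the same assay.
    """
    if candidate_activity <= 0 or reference_activity <= 0:
        raise ValueError("activities must be positive")
    # A value below 1 means the candidate is more active than the reference.
    fold_less = reference_activity / candidate_activity
    if fold_less <= 3:
        return "most preferred"      # greater activity, or not more than ~3-fold less
    if fold_less <= 10:
        return "preferred"           # not more than ~10-fold less
    if fold_less <= 25:
        return "within definition"   # not more than ~25-fold less
    return "outside definition"


print(activity_tier(120.0, 100.0))  # most preferred
print(activity_tier(20.0, 100.0))   # preferred
print(activity_tier(5.0, 100.0))    # within definition
print(activity_tier(1.0, 100.0))    # outside definition
```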

Many proteins (and translated DNA sequences) contain regions where the 
amino acid composition is highly biased toward a small subset of the available 
residues. For example, membrane spanning domains and signal peptides (which are
also membrane spanning) typically contain long stretches where Leucine (L), Valine
(V), Alanine (A), and Isoleucine (I) predominate. Poly-Adenosine tracts (polyA) at
the end of cDNAs appear in forward translations as poly-Lysine (poly-K) and as
poly-Phenylalanine (poly-F) when the reverse complement is translated. These regions are
often referred to as "low complexity" regions. 
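
The polyA observation follows directly from the genetic code: AAA encodes lysine, and the reverse complement of a polyA tract is polyT, whose TTT codons encode phenylalanine. The short Python sketch below illustrates this. It carries only the two codons needed for the illustration rather than a full codon table, and the helper names are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the two codons relevant to polyA tracts; any other codon is shown as "?".
CODONS = {"AAA": "K", "TTT": "F"}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS.get(dna[i:i + 3], "?") for i in range(0, usable, 3))


poly_a = "A" * 30
print(translate(poly_a))                      # KKKKKKKKKK
print(translate(reverse_complement(poly_a)))  # FFFFFFFFFF
```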

Such regions can cause database similarity search programs such as BLAST to 
find high-scoring sequence matches that do not imply true homology. The problem is 
exacerbated by the fact that most weight matrices (used to score the alignments 
generated by BLAST) give a match between any of a group of hydrophobic amino
acids (L,V and I) that are commonly found in certain low complexity regions almost 
as high a score as for exact matches. 

In order to compensate for this, BLASTX.2 (version 2.0a5MP-WashU) 
employs two filters ("seg" and "xnu") which "mask" the low complexity regions in a 
particular sequence. These filters parse the sequence for such regions, and create a
new sequence in which the amino acids in the low complexity region have been 
replaced with the character "X". This is then used as the input sequence (sometimes 
referred to herein as "Query" and/or "Q") to the BLASTX program. While this 
regime helps to ensure that high-scoring matches represent true homology, there is a 
negative consequence in that the BLASTX program uses the query sequence that has
been masked by the filters to draw alignments. 
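
To make the masking step concrete, the Python sketch below replaces residues in low-complexity windows with "X". It is a deliberately simplified stand-in that scores each sliding window by Shannon entropy; it is not the actual "seg" or "xnu" algorithm, and the window size and entropy threshold are arbitrary values assumed for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits per residue) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq, window=12, threshold=2.2):
    """Replace every residue lying in a window whose entropy is below threshold with 'X'."""
    masked = list(seq)
    for start in range(0, len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


# A run of a single residue is fully masked.
print(mask_low_complexity("L" * 20))                 # XXXXXXXXXXXXXXXXXXXX
# A compositionally diverse sequence is left unchanged.
print(mask_low_complexity("ACDEFGHIKLMNPQRSTVWY"))   # ACDEFGHIKLMNPQRSTVWY
```

As in the filters described above, the masked string has the same length as the input, so alignment coordinates are preserved while the biased region no longer contributes to the score.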

Thus, a stretch of "X"s in an alignment shown in the following application 
does not necessarily indicate that either the underlying DNA sequence or the 
translated protein sequence is unknown or uncertain. Nor is the presence of such 
stretches meant to indicate that the sequence is identical or not identical to the
sequence disclosed in the alignment of the present invention. Such stretches may 
simply indicate that the BLASTX program masked amino acids in that region due to 
the detection of a low complexity region, as defined above. In all cases, the reference 
sequence(s) (sometimes referred to herein as "Subject", "Sbjct", and/or "S") indicated 
in the specification, sequence table (Table 1), and/or the deposited clone is (are) the
definitive embodiment(s) of the present invention, and should not be construed as
limiting the present invention to the partial sequence shown in an alignment, unless
specifically noted otherwise herein. 



Polynucleotides and Polypeptides of the Invention 

FEATURES OF PROTEIN ENCODED BY GENE NO: 1 

The translation product of this gene shares sequence homology with env 
protein (see, e.g., Genbank accession number AAD34324.1 (AF108843); all 
references available through this accession are hereby incorporated by reference 
herein.), a protein with similarity to retroviral envelope glycoproteins. 

The polypeptide of this gene has been determined to have a transmembrane 
domain at about amino acid position 493 to about 509 of the amino acid sequence 
referenced in Table 1 for this gene. Moreover, a cytoplasmic tail encompassing from 
about amino acids 510 to about 563 of this protein has also been determined. Based
upon these characteristics, it is believed that the protein product of this gene shares
structural features with type Ia membrane proteins.

This gene is expressed primarily in fetal tissues, placenta, fetal liver spleen, 
infant brain, and total fetus and to a lesser extent in tumors (poorly differentiated 
ovarian adenocarcinoma and endometrial tumor), human adult (K.Okubo) and PC3 
prostate cell line. 

Polynucleotides and polypeptides of the invention are useful as reagents for 
differential identification of the tissue(s) or cell type(s) present in a biological sample 
and for diagnosis of diseases and conditions which include but are not limited to: fetal 
development disorders, cancer and other proliferative disorders, particularly 
endometrial and ovarian cancer. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the endometrium and ovary, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues or cell 
types (e.g., fetal, reproductive, cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

Preferred polypeptides of the present invention comprise, or alternatively 
consist of, one or more immunogenic epitopes shown in SEQ ID NO: 83 as residues: 
Gln-88 to Lys-97, Glu-128 to Ser-133, Asn-166 to Pro-175, Thr-191 to Asn-196,
Asn-207 to Lys-212, Cys-232 to Gly-238, Ala-256 to Ala-263, Thr-268 to Thr-280,
Pro-311 to Cys-317, Val-347 to Leu-362, Glu-396 to Leu-406, Pro-429 to Ala-436,
Ala-464 to Lys-469, Arg-513 to Asn-520. Polynucleotides encoding said polypeptides 
are also encompassed by the invention. 

The tissue distribution and homology to retroviral envelope proteins indicate
that polynucleotides and polypeptides corresponding to this gene would be useful for
diagnosis, detection, prevention and/or treatment of cancer and other proliferative
disorders, particularly of the endometrium and ovary. 

The tissue distribution in infant brain indicates the protein product of this 
clone would be useful for the detection, treatment, and/or prevention of 
neurodegenerative disease states, behavioral disorders, or inflammatory conditions. 
Representative uses are described in the "Regeneration" and "Hyperproliferative
Disorders" sections below, in Examples 11, 15, and 18, and elsewhere herein. Briefly,
the uses include, but are not limited to the detection, treatment, and/or prevention of 
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome, 
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia, 
trauma, congenital malformations, spinal cord injuries, ischemia and infarction,
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive 
compulsive disorder, depression, panic disorder, learning disabilities, ALS, 
psychoses, autism, and altered behaviors, including disorders in feeding, sleep 
patterns, balance, and perception. In addition, elevated expression of this gene 
product in regions of the brain indicates it plays a role in normal neural function.
Potentially, this gene product is involved in synapse formation, neurotransmission,
learning, cognition, homeostasis, or neuronal differentiation or survival.

The expression within fetal tissue and other cellular sources marked by
proliferating cells indicates this protein may play a role in the regulation of cellular 
division, and may show utility in the diagnosis, treatment, and/or prevention of 
developmental diseases and disorders, including cancer, and other proliferative 
conditions. Representative uses are described in the "Hyperproliferative Disorders" 
and "Regeneration" sections below and elsewhere herein. Briefly, developmental 
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern 
formation. Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and would be useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:11 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence
would be cumbersome. Accordingly, preferably excluded from the present invention
are one or more polynucleotides comprising a nucleotide sequence described by the
general formula of a-b, where a is any integer between 1 to 2205 of SEQ ID NO:11, b
is an integer of 15 to 2219, where both a and b correspond to the positions of
nucleotide residues shown in SEQ ID NO:11, and where b is greater than or equal to a
+ 14.
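
The a-b formula above defines every contiguous stretch of at least 15 nucleotides within a sequence of 2219 positions. The Python sketch below checks whether a given pair (a, b) satisfies the stated constraints and counts how many such pairs exist. The function names are illustrative assumptions, and the sketch reads "between 1 to 2205" as inclusive of both endpoints.

```python
A_MIN, A_MAX = 1, 2205    # allowed start positions, as stated in the text
B_MIN, B_MAX = 15, 2219   # allowed end positions, as stated in the text
MIN_OFFSET = 14           # b >= a + 14, i.e. a span of at least 15 nucleotides


def is_valid_range(a, b):
    """Return True if (a, b) satisfies all constraints of the a-b formula."""
    return A_MIN <= a <= A_MAX and B_MIN <= b <= B_MAX and b >= a + MIN_OFFSET


def count_valid_ranges():
    """Count every (a, b) pair allowed by the formula."""
    # For each start a, the end b may run from a + 14 up to B_MAX.
    return sum(B_MAX - (a + MIN_OFFSET) + 1 for a in range(A_MIN, A_MAX + 1))


print(is_valid_range(1, 15))        # True: the first 15 nucleotides
print(is_valid_range(2205, 2219))   # True: the last 15 nucleotides
print(is_valid_range(100, 110))     # False: only 11 nucleotides long
print(count_valid_ranges())         # 2432115
```

The count equals 2205 x 2206 / 2, which illustrates why listing every related sequence individually would be cumbersome.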

FEATURES OF PROTEIN ENCODED BY GENE NO: 2

This gene shares sequence homology with members of the B7 family of 
ligands (i.e., B7-1 (See Genbank Accession 507873)). These proteins and their 
corresponding receptors play vital roles in the growth, differentiation and death of T
cells. For example, some members of this family (i.e., B7-H1) are involved in co- 
stimulation of the T cell response, as well as inducing increased cytokine production. 
Therefore, antagonists such as antibodies or small molecules directed against the 
translation product of this gene are useful for treating T cell mediated immune system 
disorders. 

In additional nonexclusive embodiments, polypeptides of the invention 
comprise, or alternatively consist of, the following amino acid sequence: 

LEVQVPEDPVVALVGTDATLCCSFSPEPGFSLAQLNLIWQLTDTKQLVHSFAE 
GQDQGSAYANRTALFPDLLAQGNASLRLQRVRVADEGSFTCFVSIRDFGSAA 
VSLQVAAPYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQGVP 
LTGNVTTSQMANEQGLFDVHSILRVVLGANGTYSCLVRNPVLQQDAHSSVTI
TGQPMTF (SEQ ID NO: 158). Moreover, fragments and variants of these 
polypeptides (such as, for example, fragments as described herein, polypeptides at 
least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to these polypeptides 
and polypeptides encoded by the polynucleotide which hybridizes, under stringent 
conditions, to the polynucleotide encoding these polypeptides, or the complement
thereof) are encompassed by the invention. Antibodies that bind polypeptides of the
invention are also encompassed by the invention. Polynucleotides encoding these
polypeptides are also encompassed by the invention. 

Also preferred are polypeptides comprising, or alternatively consisting of, 
fragments of the mature extracellular portion of the protein demonstrating functional 
activity. Polynucleotides encoding these polypeptides are also encompassed by the
invention. 

Such functional activities include, but are not limited to, biological activity 
(e.g., T cell costimulatory activity, ability to bind ICOS, and ability to induce or 
inhibit cytokine production), antigenicity, immunogenicity (ability to generate 
antibody which binds to a polypeptide of the invention), ability to form multimers
with polypeptides of the invention, and ability to bind to a receptor or ligand for a 
polypeptide of the invention. 

Additionally, the translation product of this gene shares sequence homology 
with butyrophilin and butyrophilin-like molecules (See, e.g., Genbank Accession No.
emb|CAB38473.1| (AL034394) dJ1077I5.1 and gb|AAC05288.1| (AF050157); in
addition to the following Geneseq Accession Nos. W46488, W97816, W71592, and 
W78917; all information and references available through these accessions are hereby 
incorporated herein by reference): 






gb|AAC05288.1| (AF050157) butyrophilin-like (Mus musculus) >sp|O70355|O70355
BUTYROPHILIN-LIKE (FRAGMENT).
Length = 452

Plus Strand HSPs:

Score = 255 (89.8 bits), Expect = 2.9e-23, Sum P(2) = 2.9e-23
Identities = 80/292 (27%), Positives = 137/292 (46%), Frame = +1

Query: 613 GPGDMVTITCSSY<^YPEAEVFKQDGQGVPL 7g2 

G G-f V ♦ C+S + PE EV W + G L ♦ ♦ ♦ + E GLF V L V 
Sbjct: 156 GEGE-VQLVCTSRGWFPEPEVHWEGIWGEKLM-SFSENHV^ 213 

Query: 793 GTYSCLVRNPVLQQDAHSSVTITPQ- RSPTGAVEVQVPEDPWALVGTDATLHCSFSPEP 969 
->C T SC + +■ L+ + + + ♦ ++ ♦ ++ +v V P VG ♦ L C SP+ 

Sbjct: 214 ETISCFIYSHGLRETQEATIALSERLQTELASVSVIGHSQPSPVQVGENIELTCHLSPQT 273

Query: 970 GFSLTQLNLIWQLTDTKQLVHSFTEGR DQG S A YANRTALF PDLLAQGNAS LRLQRV 1137 

L + W + VH+G *Q Y RT-fL D + 4G +L+ + 

Sbjct: 274 DAQNLEVRWLRSRYYPAVHVYANGTHVAGEQMVEYKGRTSLVTDAIHEGKLTLQIHNA 331

Query: 1138 RVADEGSFTCFVSIRD- - FGSAAVSLQVAAPYSKPSMTLEPNKDLRPGDTVTITCSSYRG 1311 

R +DEG + C +D ♦ A V +QV A S P *T E KD G + ♦ C + S 
Sbjct: 332 RTSDEGQYRCLFG - KIX5VYQEARVT5VQVMAVGSTPRI TREVLKD GG - MQLRCTSDGW 386 

Query: 1312 YPEAEVFWQDGQGVPLTGNVTTSQMANEQGLFDVHSVLRVVLGANGTYSCLVRNPVLQQ 1488

♦ P V W+D G ♦ Q + + ♦ LF V * + L V G+ +C ♦ P+ Q+ 

Sbjct: 387 FPRPHVQWRDRW3KTMPSFSEAFQQGSQE-LFQVETLLLVTNGSMVNVTCSISLPLGQE 444

Score = 194 (68.3 bits), Expect = 4.6e-11, P = 4.6e-11
Identities = 58/210 (27%), Positives = 103/210 (49%), Frame = +1

Query: 901 PED PWA LVGTD ATLH CS FS P E PG FSLTQLNLI WQLTDTKQLVHS FT EG R D - QG SAY 1068 

P P++A VG DA L C P+ ♦ + * W +D V + +G + G Y 
Sbjct: 34 PNLP I LAKVGEDALLTCQLLPKR - - TTAHMEVRWYRSDPDMPVI MYRDGAEVTGLPMEGY 91 

Query: 1069 ANRTALFPDLLAQGNASLRLQRVRVADEGSFTCFVSIRDFG-SAAVSLQVAAPYSKPSMT 1245

R D +G+ +L++++V+ +D+G ♦ C D+ +V LQVAA S P+ + 

Sbjct: 92 GGRAEWMEDSTEEGSVALKIRQVQPSDDGQYHCRFQEGDYWRETSVLLQVAALGSSPNIH 151



Query: 1246 LEPNKDLRPGDTVTITCSSYRGYPEAEVFWQDGQGVPLTGNVTTSQMANEQGLFDVHSVL 1425
+E L G+ V + C+S +PEEVW+ G L + + + + E GLF V L 

Sbjct: 152 VE GLGEX3E-VQLVCTSRGWFPEPEVTWEGIWGEKLM-SFSENHVPGEDGLFYVEDTL 206



Query: 1426 RWLGANGTYSCLVRNPVLQQDAHGSVTITGQPMT 1530

V ♦ TSC+ + L+ + +♦ +♦ + T 

Sbjct: 207 MVRKDSVETISCFI YSHGLRETQEATIALSERLQT 241 

Score = 105 (37.0 bits), Expect = 0.24, P = 0.21
Identities = 30/100 (30%), Positives = 44/100 (44%), Frame = +2



Query: 254 PWALVGTDATLCCSFSPEPGFSLAQLNLIWQLTDTKQLVHSFAEGQ DQGSAYANR 421

P VG ♦ L C SP + L + W ♦ VH+AG +Q YR 

Sbjct: 254 PSPVQVGEN I ELTCHLS PQT - - DAQNLEVRWLRS RYY PAVHVYANGTHVAG EQMVE YKGR 311 

Query: 422 TALFLDLLAQGNASLRLQSVRVADEGQLHLLREHPGFRQRCR 547
T+L D + +G + L+ + + R +DEGQ L G Q R

Sbjct: 312 TSLVTDAIHEGKLTLQIHNARTSDEGQYRCLFGKDGVYQEAR 353 

Score = 97 (34.1 bits), Expect = 2.9e-23, Sum P(2) = 2.9e-23
Identities = 25/88 (28%), Positives = 44/88 (50%), Frame = +2

Query: 245 PEDPWALVGTDATLCCS FSPE PG FS LAQLNL I WQLTDTKQLVHS FAEGQD - QG SAY 412

P P-n-A VG DA L C P+ + A «■ ♦ W +D V + +G + G Y 
Sbjct: 34 PNLPI LAKVGEDALLTCQLLPKR- - TTAHMEVRVTYRSDPDMP VI MYRDGAEVTGLPMEGY 91 

Query: 413 ANRTALFLDLLAQGNASLRLQSVRVADEGQ 502

R D + + L+ + + V+ +D+GQ 

Sbjct: 92 GGRAEWMEDSTEEGSVALKIRQVQPSDDGQ 121 
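
The identity and positive percentages reported for the four alignment segments above can be checked against their raw counts. The Python sketch below recomputes them. It assumes that the percentages are truncated to a whole number rather than rounded, an assumption suggested by the first segment, where 137/292 is about 46.9% but is reported as 46%. The tuple layout and function name are illustrative.

```python
# (score, identities, positives, aligned length, reported identity %, reported positive %)
# Values are taken from the four segment headers of the alignment above.
HSPS = [
    (255, 80, 137, 292, 27, 46),
    (194, 58, 103, 210, 27, 49),
    (105, 30, 44, 100, 30, 44),
    (97, 25, 44, 88, 28, 50),
]


def truncated_percent(count, total):
    """Whole-number percentage, truncated (not rounded)."""
    return 100 * count // total


for score, ident, pos, length, rep_ident, rep_pos in HSPS:
    calc_ident = truncated_percent(ident, length)
    calc_pos = truncated_percent(pos, length)
    status = "ok" if (calc_ident, calc_pos) == (rep_ident, rep_pos) else "MISMATCH"
    print(f"Score {score}: identities {calc_ident}%, positives {calc_pos}% ({status})")
```

Under the truncation assumption, all four reported pairs agree with the recomputed values.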



Butyrophilin is thought to be important in the process of lactation and milk secretion.
Based on the sequence similarity, the translation product of this clone is expected to 
share at least some biological activities with butyrophilin and/or oligodendrite 
proteins. Such activities are known in the art, some of which are described elsewhere 
herein. 

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by
the present invention. Specifically, polypeptides of the invention comprise, or 
alternatively consist of, the following amino acid sequence: 
ARLGRVPESQSRRGAAGAAFHHG
GALW
FCLTGALEVQVPEDPWALVGTDATLCCSFSPEPGFSLAQLNLIWQLTDTKQL
VHSFAEGQDQGSAYANRTALFLDLLAQGNASLRLQSVRVADEGQLHLLREH
PGFRQRCRQPAGGRSLLEAQHDPGAQQGPAARGTW (SEQ ID NO: 155).
Polynucleotides encoding these polypeptides are also encompassed by the invention. 

In specific embodiments, polypeptides of the invention comprise, or
alternatively consist of, the following amino acid sequence:
PWSPTRTCGPGDMVTITCSSYQGYPEAEVFWQDGQGVPLTGNVTTSQMANE
QGLFDVHSILRVVLGANGTYSCLVRNPVLQQDAHSSVTITPQRSPTGAVEVQ
VPEDPWALVGTDATLHCSFSPEPGFSLTQLNLIWQLTDTKQLVHSFTEGRDQ
GSAYANRTALFPDLLAQGNASLRLQRVRVADEGSFTCFVSIRDFGSAAVSLQ
VAAPYSKPSMTLEPNKDLRPGDTVTITCSSYRGYPEAEVFWQDGQGVPLTGN
VTTSQMANEQGLFDVHSVLRVVLGANGTYSCLVR
NPVLQQDAHGSVTITGQPMTFPPEALW^WGL^VCLIALLVALPFVCWTIKIK
QSCEEENAGAEDQDGEGEGSKTALQPLKHSDSKEDDGQEIA (SEQ ID NO:
156). Moreover, fragments and variants of these polypeptides (such as, for example,
fragments as described herein, polypeptides at least 80%, 85%, 90%, 95%, 96%,
97%, 98%, or 99% identical to these polypeptides, and polypeptides encoded by a
polynucleotide which hybridizes, under stringent conditions, to the polynucleotide
encoding these polypeptides, or the complement thereof) are encompassed by the
invention. Antibodies that bind polypeptides of the invention are also encompassed by
the invention. Polynucleotides encoding these polypeptides are also encompassed by
the invention.
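
The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (equal length, with '-' marking gaps) and counts identical columns over the alignment length; the function names and the two 31-residue fragments are illustrative only, modeled on the region shared by SEQ ID NO: 155 and SEQ ID NO: 156. The alignment program and scoring parameters actually used to define identity are not specified in this passage, so this is not presented as the method of the invention.

```python
from typing import List

# Thresholds recited in the text for variant polypeptides.
THRESHOLDS = (80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-aligned sequence pair.

    Assumption: both strings are already aligned (same length, '-' = gap).
    Columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> List[int]:
    """Return every recited threshold that the identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Illustrative fragments (hypothetical test data, not claimed sequences).
frag_a = "PEDPWALVGTDATLCCSFSPEPGFSLAQLNL"
frag_b = "PEDPWALVGTDATLHCSFSPEPGFSLTQLNL"

identity = percent_identity(frag_a, frag_b)
print(round(identity, 1), thresholds_met(identity))
```

The two fragments differ at 2 of 31 positions, so the pair would satisfy the 80%, 85%, and 90% thresholds but not 95% or higher.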

The gene encoding the disclosed cDNA is believed to reside on chromosome 
15. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 15. 

This gene is expressed primarily in dendritic cells and to a lesser extent in 
fetal liver and spleen, normal colon, and normal liver. It is also expressed in various 
tumors including ovary, glioblastoma, germ cell tumors, pancreatic tumor, and 
germinal center B-cell cancer. 

Polynucleotides and polypeptides of the invention are useful as reagents for 
differential identification of the tissue(s) or cell type(s) present in a biological sample 
and for diagnosis of diseases and conditions which include but are not limited to 
cancer and immune disorders including autoimmune diseases and immuno-deficiency 
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues or cell types (e.g., cancerous 
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and 
spinal fluid) or another tissue or sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise, or alternatively
consist of, one or more immunogenic epitopes shown in SEQ ID NO: 84 as residues:
Glu-72 to Gly-77, Arg-115 to Arg-125, His-138 to Pro-146. Polynucleotides encoding
said polypeptides are also encompassed by the invention.
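
As a cross-check on the residue ranges above, the following sketch extracts each epitope from the portion of SEQ ID NO: 84 that appears in the sequence listing excerpt (residues 65 to 152). The one-letter string is a transcription of that listing with evident OCR errors repaired (an assumption), and the helper name `epitope` is hypothetical.

```python
# Residues 65-152 of SEQ ID NO: 84 in one-letter code, transcribed from the
# sequence listing (assumption: OCR errors such as "Gin" for "Gln" repaired).
SEQ84_TAIL = (
    "QLVHSFAEGQDQGSAY"   # residues 65-80
    "ANRTALFLDLLAQGNA"   # residues 81-96
    "SLRLQSVRVADEGQLH"   # residues 97-112
    "LLREHPGFRQRCRQPA"   # residues 113-128
    "GGRSLLEAQHDPGAQQ"   # residues 129-144
    "GPAARGTW"           # residues 145-152
)
TAIL_START = 65  # residue number of SEQ84_TAIL[0]

# Only the residue names used as epitope boundaries in the text.
ONE_LETTER = {"Glu": "E", "Gly": "G", "Arg": "R", "His": "H", "Pro": "P"}


def epitope(first: str, last: str) -> str:
    """Return the peptide spanning e.g. 'Glu-72' to 'Gly-77', inclusive."""
    (name_a, pos_a), (name_b, pos_b) = (x.split("-") for x in (first, last))
    start = int(pos_a) - TAIL_START
    end = int(pos_b) - TAIL_START
    if start < 0 or end >= len(SEQ84_TAIL) or start > end:
        raise ValueError("range lies outside the transcribed region")
    peptide = SEQ84_TAIL[start:end + 1]
    # Sanity check: the named boundary residues must match the sequence.
    if peptide[0] != ONE_LETTER[name_a] or peptide[-1] != ONE_LETTER[name_b]:
        raise ValueError("boundary residues do not match the sequence")
    return peptide


for first, last in [("Glu-72", "Gly-77"),
                    ("Arg-115", "Arg-125"),
                    ("His-138", "Pro-146")]:
    print(first, "to", last, "->", epitope(first, last))
```

Checking the boundary residue names against the sequence is a cheap way to catch numbering slips: all three recited ranges begin and end on the residues named in the text.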

The dendritic cell distribution and homology to the butyrophilin family
indicate that polynucleotides and polypeptides corresponding to this gene are useful
for down-regulation or stimulation of the immune response. Dendritic cells play a
pivotal role in immune surveillance: they are responsible for the capture and
processing of antigens from the periphery and subsequent presentation of these
antigens to B and T lymphocytes in lymphoid organs. Dendritic cells also produce
and secrete numerous immunomodulatory proteins. The butyrophilin family appears
to have a receptor-like structure having an extracellular domain, transmembrane
domain and intracellular region. The encoded protein may act as a membrane-bound
receptor to mediate the interaction of dendritic cells with other cells of the immune
system. This interaction could be with either soluble factors produced by other
immune cells or with membrane proteins present on other immune cells. Such
interactions may result in a stimulation or down-regulation of dendritic cell function.
Subsequently the immune system may be stimulated to respond against specific
antigens, or the response may be dampened as is seen in tolerance of self-antigens. The
inability to effectively inhibit immune responses to self-antigens could result in
autoimmune disease. Conversely, the inability to stimulate correct responses could result
in an immuno-deficiency syndrome and subsequent susceptibility to infectious agents.

Additionally, the expression of this gene in numerous tumors may reflect the
role that this molecule plays in the body's normal anti-tumor surveillance system;
tumor cells may express this protein in order to stimulate an immune response (e.g.,
targeting of cytotoxic T-cells against the tumor cells). Alternatively, the molecule may
be used by tumors to dampen the cytotoxic immune response and thus be a means by
which tumors escape killing.

Moreover, the tissue distribution in fetal liver, spleen, and germinal center
B-cells indicates that the protein product of this clone is useful for the diagnosis and treatment
of a variety of immune system disorders. Representative uses are described in the
"Immune Activity" and "Infectious Disease" sections below, in Examples 11, 13, 14,
16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the expression of this gene
product indicates a role in regulating the proliferation, survival, differentiation, and/or
activation of hematopoietic cell lineages, including blood stem cells. This gene
product is involved in the regulation of cytokine production, antigen presentation, or
other processes suggesting a usefulness in the treatment of cancer (e.g., by boosting
immune responses). Since the gene is expressed in cells of lymphoid origin, the
natural gene product is involved in immune functions. Therefore it is also useful as an
agent for immunological disorders including arthritis, asthma, immunodeficiency
diseases such as AIDS, leukemia, rheumatoid arthritis, granulomatous disease,
inflammatory bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis,
hypersensitivities, such as T-cell mediated cytotoxicity; immune reactions to
transplanted organs and tissues, such as host-versus-graft and graft-versus-host
diseases, or autoimmunity disorders, such as autoimmune infertility, lens tissue
injury, demyelination, systemic lupus erythematosus, drug-induced hemolytic anemia,
rheumatoid arthritis, Sjogren's disease, and scleroderma. Moreover, the protein may
represent a secreted factor that influences the differentiation or behavior of other
blood cells, or that recruits hematopoietic cells to sites of injury. Thus, this gene
product is thought to be useful in the expansion of stem cells and committed
progenitors of various blood lineages, and in the differentiation and/or proliferation of
various cell types. Furthermore, the protein may also be used to determine biological
activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 12 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence
would be cumbersome. Accordingly, preferably excluded from the present invention
are one or more polynucleotides comprising a nucleotide sequence described by the
general formula of a-b, where a is any integer between 1 and 3422 of SEQ ID NO: 12, b
is an integer of 15 to 3436, where both a and b correspond to the positions of
nucleotide residues shown in SEQ ID NO: 12, and where b is greater than or equal to a
+ 14.
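
The a-b formula above defines a family of subsequences of at least 15 nucleotides. The following sketch shows how membership in that family can be tested and how many (a, b) pairs it contains for a sequence of 3436 nucleotides. It assumes the stated bounds are inclusive and that positions are 1-based; the function names are hypothetical.

```python
def in_excluded_family(a: int, b: int, seq_len: int, min_len: int = 15) -> bool:
    """True if positions a..b satisfy the a-b formula for a sequence of seq_len.

    Assumptions: 1-based, inclusive positions; b >= a + (min_len - 1).
    """
    last_start = seq_len - (min_len - 1)      # 3422 when seq_len is 3436
    return (1 <= a <= last_start
            and min_len <= b <= seq_len
            and b >= a + (min_len - 1))


def count_pairs(seq_len: int, min_len: int = 15) -> int:
    """Number of distinct (a, b) pairs allowed by the formula."""
    last_start = seq_len - (min_len - 1)
    # For each a there are (last_start - a + 1) valid values of b.
    return last_start * (last_start + 1) // 2


def fragment(seq: str, a: int, b: int) -> str:
    """Return nucleotides a..b (1-based, inclusive) of seq."""
    return seq[a - 1:b]


print(in_excluded_family(1, 15, 3436))       # shortest fragment at the 5' end
print(in_excluded_family(3422, 3436, 3436))  # shortest fragment at the 3' end
print(in_excluded_family(1, 14, 3436))       # only 14 nucleotides long
print(count_pairs(3436))
```

Under these assumptions the formula covers 5,856,753 distinct (a, b) pairs for a 3436-nucleotide sequence, which illustrates why listing every related sequence individually would be cumbersome.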



FEATURES OF PROTEIN ENCODED BY GENE NO: 3

The translation product of this gene shares sequence homology with matrilin
and other cartilage matrix proteins (see, e.g., Genbank Accession Nos.
emb|CAA06889.1| (AJ006140); and/or emb|CAA30915.1|; all references available
through these accessions are hereby incorporated in their entirety by reference
herein). Matrilins are members of a superfamily with von Willebrand factor type
A-like modules, which is thought to be important in forming an extracellular,
filamentous network.

Moreover, the translation product of this gene also shares sequence homology 
with the kidney injury associated molecule (KIM) protein (See Geneseq Accession 
No. W86326; all references and information available through this accession are 
hereby incorporated herein by reference). Based on the sequence similarity, the 
translation product of this clone is expected to share at least some biological activities 
with matrilin, cartilage matrix proteins and KIM proteins. Such activities are known 
in the art, some of which are described elsewhere herein. 

In specific embodiments, polypeptides of the invention comprise, or 
alternatively consist of, an amino acid sequence selected from the group: 
KXPCXYRSGffGSTHASVPSAPRPSRAMLPWTA^GLALSLRLALARSGAERG
PPASAPRGDLMFLLDSSASVSHYEFSRVREFVGQLVAPLPLGTGALRASLVHV
GSRPYTEFPFGQHSSGEAAQDAVRASAQRMGDTHTGLALVYAKEQLFAEAS
GARPGVPKVLVWVTDGGSSDPVGPPMQELKDLGVTVFIVSTGRGNFLELSAA
ASAPAEKHLHFVDVDDLHIIVQELRGSILDAMRP (SEQ ID NO: 159),
APAWGGPQGRWSRHLSPTPALWAPLAGHLMLQQTAVPWHRPAPGQCGCHP
CAGQKHAPHPGQPHPSCAGRRGTRCMADCPRAPDWHAGPRCPGAVEPPAAP
QTPEPGRTRSERRWLSCPAGTSGPLGGLMLVDRAPRRSAPAPAASSGPGRXPS
J^GASRA RDGARSARTRGSTREFRTGXCRVXSX rSF ; OJTjJ^
HASVPSAPRPSRAMLPWTALGLALSLRLALARSGAERGPPASAPRGDLMFLL
DSSASVSHYEFSRVREFVGQLVAPLPLGTGALRASLVHVGSRPYTEFPFGQHS
SGEAAQDAVRASAQRMGDTHTGLALVYAKEQLFAEASGARPGVPKVLVWV
TDGGSSDPVGPPMQELKDLGVTVFIVSTGRG^FLEJ^
VvDDLHWO^LRGS^ (SEQ
ID NO: 161), GALRASLVHVGSRP (SEQ ID NO: 162), GVPKVLVWVTDG (SEQ
ID NO: 163), and VGPPMQELKDLGVT (SEQ ID NO: 164). Moreover, fragments
and variants of these polypeptides (such as, for example, fragments as described 
herein, polypeptides at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% 
identical to these polypeptides, or polypeptides encoded by a polynucleotide which 
hybridizes, under stringent conditions, to the polynucleotide encoding these 
polypeptides) are encompassed by the invention. Antibodies that bind polypeptides of 
the invention and polynucleotides encoding these polypeptides are also encompassed 
by the invention. 

This gene is expressed primarily in uterus, brain, lung, colon, kidney,
placenta, and dendritic cells.

Polynucleotides and polypeptides of the invention are useful as reagents for 
differential identification of the tissue(s) or cell type(s) present in a biological sample 
and for diagnosis of diseases and conditions which include but are not limited to: 
renal, neural, endothelial, developmental, and reproductive diseases and/or disorders, 
particularly disorders resulting from tissue structural damage or abnormalities.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
uterus, placenta, kidney, lung, brain, and colon, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues or cell
types (e.g., renal, neural, endothelial, developmental, reproductive, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal
fluid) or another tissue or sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy 
tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in kidney, combined with the homology to the matrilin
and KIM proteins, indicates that polynucleotides and polypeptides corresponding to
this gene would be useful for treatment, prevention, detection and/or diagnosis of
disorders involving tissues with structural damage or abnormalities, particularly
organs or tissues such as uterus, placenta, kidney, lung, brain, and colon. Matrilin
may also be involved in extracellular transport, storage, or barrier functions for molecular factors
such as growth factors and hormones, thereby modulating organ function.
Representative uses are described in the "Biological Activity", "Hyperproliferative
Disorders", "Infectious Disease", and "Regeneration" sections below, in Examples 11,
19, and 20, and elsewhere herein.

In addition, expression in the placenta indicates that polynucleotides and/or
polypeptides corresponding to this gene would be useful in treating, preventing,
detecting and/or diagnosing placenta-related functions or diseases, e.g., induced
abortion or spontaneous abortion; hyperplastic abnormalities; factors involved in
circulation and nutrient transport; prevention of multiple gestation; gestational
trophoblastic diseases, such as hydatidiform mole as well as placental site
trophoblastic tumor and choriocarcinoma; uterus-related functions, e.g., disorders during
the menstrual cycle or pregnancy, inflammatory changes, such as pyometra,
endometritis and dysfunctional bleeding; contraceptives, abortion and birth control;
infertility caused by blastocyst, embryo or fetus implantation problems; utilities in
surrogate pregnancy; tumors or hyperplasia of the uterus, with epithelium, stroma or
smooth muscle origins; brain-related functions, e.g., trauma, congenital
malformations, spinal cord injuries, ischemia and infarction, aneurysms,
hemorrhages, toxic neuropathies induced by neurotoxins, inflammatory diseases such
as meningitis and encephalitis, demyelinating diseases, neurodegenerative diseases
such as Parkinson's disease, Huntington's disease, Alzheimer's disease, peripheral
neuropathies, multiple sclerosis, neoplasia of neuroectodermal origin, etc.; as well as
diseases implicated in lung and colon functions. Polynucleotides and/or polypeptides of
the invention can be used to promote growth and/or survival of damaged tissue (e.g.,
renal tissue), since KIM proteins are upregulated in injured or regenerating (especially
renal) tissues. Fusion proteins of the invention, conjugates, antibodies and vectors can
also be used therapeutically, e.g., these or KIM proteins (or a protein having KIM
activity) may be included with an acceptable carrier in pharmaceutical compositions
useful for therapy/prophylaxis of conditions associated with
dysfunction/dysregulation of genes or proteins of the invention, especially renal
diseases or impairments of renal function in humans (e.g., acute renal failure, acute
nephritis). The polynucleotides can be used to produce antisense sequences which,
when internalized into cells, can disrupt expression of a cellular gene, also useful in
therapy (e.g., to block the growth of tumors dependent on polynucleotides or
polypeptides of the invention for growth) or compositions. The proteins and
polynucleotides would be useful diagnostically, e.g., to detect and quantify renal
injury/disease (indicative of increased risk, or presence of, renal injury or impaired
function), or abnormal responses to tissue injury (indicative of increased risk, or
presence of, an autoimmune response or abnormal tissue growth arising
from/affecting renal tissue). The proteins can also be used to locate cells producing
polypeptides of the invention (especially specific loci, e.g., tissue masses abnormally
producing/expressing polynucleotides or polypeptides of the invention such as tumors
arising from/affecting renal tissue), by contacting cells with an imageable reagent
which binds to polynucleotides or polypeptides of the invention and imaging reagent
accumulation. Furthermore, the protein may also be used to determine biological
activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors,
to identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 13 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence
would be cumbersome. Accordingly, preferably excluded from the present invention
are one or more polynucleotides comprising a nucleotide sequence described by the
general formula of a-b, where a is any integer between 1 and 720 of SEQ ID NO: 13, b
is an integer of 15 to 734, where both a and b correspond to the positions of
nucleotide residues shown in SEQ ID NO: 13, and where b is greater than or equal to a
+ 14.


FEATURES OF PROTEIN ENCODED BY GENE NO: 4 

The translation product of this gene shares sequence homology with Liv-1,
which is thought to be an estrogen-regulated gene associated with breast cancer. The
polypeptide of this gene has been determined to have seven transmembrane domains
at about amino acid positions 3-19, 400-436, 433-457, 493-512, 736-753, 758-781,
and/or 800-827 of the amino acid sequence referenced in Table 1 for this gene. Based
upon these characteristics, it is believed that the protein product of this gene shares
structural features with type IIIa membrane proteins.

Included in this invention as preferred domains are zinc finger, C2H2 type,
and cytochrome c family heme-binding site signature domains, which were identified
using the ProSite analysis tool (Copyright, Swiss Institute of Bioinformatics). 'Zinc
finger' domains [1-5] are nucleic acid-binding protein structures first identified in the
Xenopus transcription factor TFIIIA. These domains have since been found in
numerous nucleic acid-binding proteins.

A zinc finger domain is composed of 25 to 30 amino-acid residues. There are
two cysteine or histidine residues at both extremities of the domain, which are
involved in the tetrahedral coordination of a zinc atom. It has been proposed that such
a domain interacts with about five nucleotides.

A schematic representation of a zinc finger domain is shown below:

xxxxxxxxxxxxCHx/xxZnxx/xCHxxxxxxxxxx 
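
The arrangement of two cysteines followed by two histidines can be searched for with a regular expression. The sketch below is illustrative only: the consensus it encodes, C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, is a commonly cited C2H2 signature whose exact spacer lengths are an assumption here and should be checked against the ProSite entry; the test peptide and the function name are hypothetical and are not sequences of the invention.

```python
import re

# Assumed C2H2 consensus, written ProSite-style as
#   C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
# and translated to a regular expression ('.' stands for any residue).
C2H2 = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2(seq: str):
    """Return (start, end, match) tuples using 1-based, inclusive positions."""
    return [(m.start() + 1, m.end(), m.group())
            for m in C2H2.finditer(seq.upper())]


# Hypothetical finger-like peptide used only to exercise the pattern.
HYPOTHETICAL_FINGER = "YKCPECGKSFSQKSNLQKHQRTHTG"

for start, end, hit in find_c2h2(HYPOTHETICAL_FINGER):
    print(start, end, hit)
```

The signature spans only the zinc-coordinating core (the Cys pair through the His pair), so a hit is shorter than the 25 to 30 residues attributed to the whole domain.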
Gln Leu Val His Ser Phe Ala Glu Gly Gln Asp Gln Gly Ser Ala Tyr
65 70 75 80

Ala Asn Arg Thr Ala Leu Phe Leu Asp Leu Leu Ala Gln Gly Asn Ala
85 90 95

Ser Leu Arg Leu Gln Ser Val Arg Val Ala Asp Glu Gly Gln Leu His
100 105 110

Leu Leu Arg Glu His Pro Gly Phe Arg Gln Arg Cys Arg Gln Pro Ala
115 120 125

Gly Gly Arg Ser Leu Leu Glu Ala Gln His Asp Pro Gly Ala Gln Gln
130 135 140

Gly Pro Ala Ala Arg Gly Thr Trp 
145 150 
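
The sequence listing uses three-letter residue codes, whereas the description quotes the same sequences in one-letter code. A minimal sketch for converting listing rows to one-letter code, which can help in cross-checking the two, is given below. The function name is hypothetical; position-number rows are skipped, and any unrecognised token (for example an OCR-damaged code) raises an error rather than being guessed.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",  # any naturally occurring L-amino acid
}


def listing_to_one_letter(listing: str) -> str:
    """Convert three-letter sequence-listing rows to a one-letter string."""
    residues = []
    for token in listing.split():
        if token.isdigit():
            continue  # position-number row, not a residue
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognised residue code: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# Final row of the listing for SEQ ID NO: 84, with its position numbers.
last_row = """
Gly Pro Ala Ala Arg Gly Thr Trp
145 150
"""
print(listing_to_one_letter(last_row))  # GPAARGTW
```

The result, GPAARGTW, agrees with the C-terminal residues of SEQ ID NO: 155 as quoted in the description.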

<210> 85 
<211> 215 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (7)

<223> Xaa equals any of the naturally occurring L-amino acids 



<400> 85 

Met Leu Pro Trp Thr Ala Xaa Gly Leu Ala Leu Ser Leu Arg Leu Ala 
1 5 10 15



Leu Ala Arg Ser Gly Ala Glu Arg Gly Pro Pro Ala Ser Ala Pro Arg
20 25 30 

Gly Asp Leu Met Phe Leu Leu Asp Ser Ser Ala Ser Val Ser His Tyr 
35 40 45 

Glu Phe Ser Arg Val Arg Glu Phe Val Gly Gln Leu Val Ala Pro Leu
50 55 60

Pro Leu Gly Thr Gly Ala Leu Arg Ala Ser Leu Val His Val Gly Ser
65 70 75 80

Arg Pro Tyr Thr Glu Phe Pro Phe Gly Gln His Ser Ser Gly Glu Ala
85 90 95

Ala Gln Asp Ala Val Arg Ala Ser Ala Gln Arg Met Gly Asp Thr His
100 105 110

Thr Gly Leu Ala Leu Val Tyr Ala Lys Glu Gln Leu Phe Ala Glu Ala
115 120 125

Ser Gly Ala Arg Pro Gly Val Pro Lys Val Leu Val Trp Val Thr Asp
130 135 140

Gly Gly Ser Ser Asp Pro Val Gly Pro Pro Met Gln Glu Leu Lys Asp
145 150 155 160

Leu Gly Val Thr Val Phe Ile Val Ser Thr Gly Arg Gly Asn Phe Leu
165 170 175

Glu Leu Ser Ala Ala Ala Ser Ala Pro Ala Glu Lys His Leu His Phe 
180 185 190 
Val Asp Val Asp Asp Leu His Ile Ile Val Gln Glu Leu Arg Gly Ser
195 200 205

Ile Leu Asp Ala Met Arg Pro
210 215 



<210> 86 
<211> 831 
<212> PRT 

<213> Homo sapiens 



<400> 86

Met Lys Val His Met His Thr Lys Phe Cys Leu Ile Cys Leu Leu Thr
1 5 10 15

Phe Ile Phe His His Cys Asn His Cys His Glu Glu His Asp His Gly
20 25 30

Pro Glu Ala Leu His Arg Gln His Arg Gly Met Thr Glu Leu Glu Pro
35 40 45

Ser Lys Phe Ser Lys Gln Ala Ala Glu Asn Glu Lys Lys Tyr Tyr Ile
50 55 60

Glu Lys Leu Phe Glu Arg Tyr Gly Glu Asn Gly Arg Leu Ser Phe Phe
65 70 75 80

Gly Leu Glu Lys Leu Leu Thr Asn Leu Gly Leu Gly Glu Arg Lys Val
85 90 95

Val Glu Ile Asn His Glu Asp Leu Gly His Asp His Val Ser His Leu
100 105 110

Asp Ile Leu Ala Val Gln Glu Gly Lys His Phe His Ser His Asn His
115 120 125

Gln His Ser His Asn His Leu Asn Ser Glu Asn Gln Thr Val Thr Ser
130 135 140

Val Ser Thr Lys Arg Asn His Lys Cys Asp Pro Glu Lys Glu Thr Val
145 150 155 160

Glu Val Ser Val Lys Ser Asp Asp Lys His Met His Asp His Asn His
165 170 175

Arg Leu Arg His His His Arg Leu His His His Leu Asp His Asn Asn
180 185 190

Thr His His Phe His Asn Asp Ser Ile Thr Pro Ser Glu Arg Gly Glu
195 200 205

Pro Ser Asn Glu Pro Ser Thr Glu Thr Asn Lys Thr Gln Glu Gln Ser
210 215 220

Asp Val Lys Leu Pro Lys Gly Lys Arg Lys Lys Lys Gly Arg Lys Ser
225 230 235 240

Asn Glu Asn Ser Glu Val Ile Thr Pro Gly Phe Pro Pro Asn His Asp
245 250 255

Gln Gly Glu Gln Tyr Glu His Asn Arg Val His Lys Pro Asp Arg Val
260 265 270

His Asn Pro Gly His Ser His Val His Leu Pro Glu Arg Asn Gly His 



